Abstract

Data Envelopment Analysis (DEA), a new methodology based on linear
programming  concepts,  provides  an  approach  to  evaluate  relative  technical 
efficiency  of  nonprofit  organizations  1)  which  have  multiple  outputs  and 
inputs  and  2)  where  the  efficient  production  function  is  not  specifiable  with 
precision.  This  paper  evaluates  the  reliability  of  DEA,  compared  with  use  of 
ratio  analysis  and  basic  regression  analysis  for  efficiency  measurement. 

DEA, ratio analysis, and regression analysis are applied to an artificial
data set in which the efficient and inefficient units are known. Without
knowledge of the technology, DEA accurately identifies certain inefficient
Decision Making Units (DMU's) when outputs and inputs are properly specified.
In  contrast,  the  ratio  analysis  and  regression  analysis  techniques  are  found 
to  be  less  reliable  for  identifying  inefficient  DMU's.  The  strengths  and 
limitations  of  DEA  are  further  elaborated  to  anticipate  issues  that  may  arise 
in  subsequent  field  applications. 
Arie Lewin and the insights of the reviewers have been extremely helpful in
developing the authors' understanding of and insights about this research.

1.  Introduction

Data Envelopment Analysis (DEA) is a new efficiency measurement methodology
developed by A. Charnes, W. W. Cooper, and E. Rhodes [9] [10] and [11]. DEA is
designed to measure the relative technical efficiency of Decision Making Units
(DMU's) which use multiple inputs to produce multiple outputs where the
underlying production function is not known with any precision. DEA has
already been applied to several types of organizations including education
[5] [6], health care [4] [22], Navy recruiting centers [17], and criminal
court systems [16]. Some of these applications have been helpful to managers
of these organizations, i.e., DEA results have been used to implement changes
to improve efficiency. Nevertheless, the validity and reliability of DEA in
locating inefficient DMU's have not been evaluated in these studies, in part
because the identity of the truly inefficient units was not known. This paper
attempts to evaluate DEA and other efficiency measurement techniques through a
simulated application to an artificial data base where the efficient and
inefficient DMU's are known with certainty. The objective is to compare the
accuracy and evaluate the validity of these techniques in locating inefficient
DMU's.¹ This process leads to some strong conclusions about the relative
strengths and weaknesses of these techniques and provides certain insights
about DEA that may be useful for future applications to real data sets.

In this paper, the efficiency measurement techniques that will be evaluated
are DEA, basic regression analysis, and ratio analysis. Regression analysis
and ratio analysis were selected because they represent approaches which are
relatively widely known and accessible to managers.² More sophisticated
regression techniques, such as flexible functional forms like the translog
function, are not considered here but also require similar evaluation and
validation for use in efficiency measurement, as discussed in Sherman [22] and
Banker, Conrad, & Strauss [4]. In other words, this paper is aimed at
comparing DEA with techniques that are already widely used rather than with
new methodologies which may also be useful for efficiency measurement.

The  following  section  describes  the  data  base  constructed  to  test 
alternative  efficiency  measurement  methodologies.  Section  3  describes  the 
version  of  DEA  that  will  be  used  to  evaluate  the  efficiency  of  these  DMU's. 
In Section 4, the results of applying DEA, simple regression analysis, and
ratio analysis are compared and summarized. Section 5 considers other
interpretations that are available from DEA as well as certain of its
limitations that are apparent from this simulation. The final section
contains  a  brief  discussion  of  other  areas  of  research  required  to  further 
validate  and  develop  the  capabilities  of  DEA  and  other  efficiency  measurement 
techniques. 

2.  Artificial Data Set - Development and Specifications

The  artificial  data  set  is  constructed  by  defining  a  hypothetical  "known" 
technology  which  applies  to  all  Decision  Making  Units  (DMU's)  and  defines 
efficient  input-output  relationships  within  these  DMUs'  industry. 
Inefficiencies  are  introduced  for  certain  DMU's  and  take  the  form  of  excess 
inputs  used  for  the  output  level  attained.  Hence,  a  DMU  that  achieves  its 
output  level  by  using  the  amount  of  inputs  required  based  on  the  specified 
technology  is  efficient  and  a  DMU  that  uses  more  inputs  than  are  required  by 
this technology is inefficient. To make the inputs and outputs easier to
recognize, they are referred to and labelled in the context of a hospital,
one type of organization which uses multiple inputs to produce multiple
outputs and where techniques like DEA may prove beneficial.³

The set of artificial hospital data generated for our simulation consisted
of three outputs produced with three inputs during a one year period of time
as follows:

Outputs

$y_1$: Regular patient* care/year (patients treated in one year with
       average level of inputs for treatment)
$y_2$: Severe patient* care/year (patients treated in one year with severe
       illness requiring higher input levels than regular patients for more
       complex treatment)
$y_3$: Teaching of residents and interns/year (number of individuals
       receiving one year of training)

Inputs

$x_1$: Staff utilized in terms of full-time equivalents, i.e., (FTE's)/year
$x_2$: Number of hospital bed days available/year
$x_3$: Supplies in terms of dollar cost/year

*measured in terms of number of patients treated

A  linear  input-output  model  was  used  to  specify  the  known  technology,  and 
it  was  assumed  to  be  applicable  to  all  hospitals.  That  is,  deviations  from 
this  structure  represent  inefficiencies  which  the  DEA  analysis — or  any  other 
analysis  that  might  be  used — should  be  able  to  detect. 

For  convenience  of  reference  all  details  of  the  model  and  resulting  data 
utilized  are  collected  together  in  the  Appendix.  The  model  and  the  input 
output  relationships  (and  data  used)  to  develop  the  model  parameters  are  given 
in  Exhibit  1.  These  production  relationships  are  assumed  to  hold  for  all 
volume  levels  of  operations  for  all  hospitals  each  of  which,  however,  may  use 
them  efficiently  or  inefficiently.  Input  costs  per  unit  are  also  fixed  at  the 
same  amounts  for  all  hospitals  and  remain  the  same  at  all  levels  of  operation 
so that the resulting production activity can be converted into common dollar
units. It is also assumed that all hospitals are subject to the same
"production function" which has constant returns to scale in all outputs.
This provides the underlying structure which we shall henceforth refer to as
the "structural model."

Via this "structural model" as represented in Exhibit 1, data were
developed  for  an  assumed  set  of  15  hospitals  based  on  arbitrary  mixes  of 
outputs.  The  related  inputs  required  were  derived  from  the  model  and  the 
outputs.  The  resulting  data  base  which  we  shall  henceforth  use  is  shown  in 
Exhibit  2.  The  first  seven  hospitals,  H1-H7,  are  efficient;  i.e.,  the  inputs 
and  outputs  are  those  required  in  the  structural  model.  The  data  generated 
for  the  next  eight  hospitals,  H8-H15,  were  developed  by  altering  the  numerical 
values  to  portray  various  inefficiencies.  The  idea  of  course  is  to  test  the 
ability  of  DEA  and  other  methodologies  to  identify  such  inefficiencies.  These 
efficiency measurement techniques would be accurate, at least as far as
classification is concerned, if they isolated H8 through H15 as inefficient in
this application.

The specific inefficiencies in H8 through H15 are designated by the circled
values in Exhibit 2. That is, these circled values refer to the specifically
inefficient elements in supposed managerial uses for these hospitals relative
to what the production function requires for the outputs they have achieved.
Exhibit 3 then presents an example of how the data for the efficient hospital,
H1, and the inefficient hospital, H15, were calculated. H15 is represented as
a) inefficient in its use of inputs to produce regular patient care and
b) efficient in its use of inputs to produce severe patient care outputs and
to provide training (teaching) outputs.

To effect comparisons such as we shall be undertaking in a multiple output
context, it is easiest to proceed from the other side and to locate all
inefficiencies in the inputs used to produce whatever output levels were
attained. We can do this because of the linear relationship we are assuming
since, as Färe and Lovell [12] have shown, the kind of efficiency measures we
are applying will give the same result from the output or the input side when
the relations are linear.

The model underlying Exhibit 2 can be formalized as follows:

$$x_{ij} = \sum_{r=1}^{3} a_{irj}\, y_{rj} \qquad (1)$$

where
    $x_{ij}$  = amount of input i used per year by hospital j
    $y_{rj}$  = amount of output r produced per year by hospital j
    $a_{irj}$ = amount of input i used per unit of output r by hospital j
                during the year.

We are positing an efficient set of $a_{ir}$'s which are the same for every
hospital. However, we retain the index j because in some cases we will assign
values $\hat{a}_{irj} > a_{ir}$ for some i and r to represent managerial
(= hospital) inefficiencies which yield values

$$\hat{x}_{ij} = \sum_{r=1}^{3} \hat{a}_{irj}\, y_{rj} \qquad (2)$$

with $\hat{x}_{ij} > x_{ij}$ when inefficiencies are present.

The efficient $a_{ir}$ values are given, free of any of the j = 1, ..., 15
hospital identification subscripts, in Exhibit 1. These values are the same
for all hospitals so that $a_{11} = .004$ FTE/patient represents the efficient
labor requirement in full time equivalent units per regular patient.
Similarly $a_{12} = .005$ FTE/patient represents the efficient requirement for
a severe patient and $a_{13} = .03$ FTE/training unit represents the efficient
requirement to train one new resident/intern during a year.

Analogous remarks apply to the values $a_{21} = 7$ bed days/patient and
$a_{22} = 9$ bed days/patient for regular and severe patients, respectively,
shown in the Bed Days column of Exhibit 1. The blank shown in the row for
Training Units in this column means that $a_{23} = 0$ applies. That is, no bed
days enter into the training outputs.

Finally, $a_{31} = \$20$/patient and $a_{32} = \$30$/patient represent the
efficient level of supplies required per regular and severe patient,
respectively, while $a_{33} = \$200$/training unit is the coefficient for
efficient training operations in output r = 3. Putting this i = 3 input in
dollar units avoids the detail that would otherwise be needed to identify the
different types of supplies that would be required for teaching and for
different types of patient treatments.

The final column of Exhibit 1 represents the efficient costs obtained from

$$c_r = \sum_{i=1}^{3} k_i\, a_{ir} \qquad (3)$$

where we have omitted the index j for hospital identification because only
efficient costs are being considered. Here $k_i$ represents the cost per unit
of the $i$th input, so that $k_i a_{ir}$ is the cost of the $i$th input
requirement for the $r$th output under efficient operations. Via the data
shown in Exhibit 1 (reflected immediately below the table in Exhibit 1)

$$k_1 = \$10{,}000/\text{FTE}$$
$$k_2 = \$10/\text{bed day} \qquad (4)$$
$$k_3 = \$1/\text{supply unit}$$

(supplies are already reflected in $ in the data base - Exhibit 3.1), from
which we obtain

$$c_1 = k_1 a_{11} + k_2 a_{21} + k_3 a_{31} = \$130/\text{regular patient}$$
$$c_2 = k_1 a_{12} + k_2 a_{22} + k_3 a_{32} = \$170/\text{severe patient} \qquad (5)$$
$$c_3 = k_1 a_{13} + k_2 a_{23} + k_3 a_{33} = \$500/\text{training unit}.$$

These  are  the  formulas  used  at  the  bottom  of  Exhibit  1  to  produce  the 
efficient  cost  of  outputs  shown  in  the  last  column  at  the  top. 
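
The arithmetic behind (3) through (5) can be checked with a short script. This is only a sketch: the coefficient table is transcribed from the text, and the values 9 bed days per severe patient and $200 of supplies per training unit are taken to be the ones implied by the totals in (5) rather than read directly from Exhibit 1.

```python
# Efficient input coefficients a[i][r]: rows are inputs (FTE's, bed days,
# supply dollars); columns are outputs (regular, severe, training).
# Assumption: a_22 = 9 and a_33 = 200 are inferred from the totals in (5).
A = [
    [0.004, 0.005, 0.03],   # FTE's per unit of output
    [7.0,   9.0,   0.0],    # bed days per unit of output
    [20.0,  30.0,  200.0],  # supply dollars per unit of output
]
K = [10_000.0, 10.0, 1.0]   # k_i: dollars per FTE, per bed day, per supply dollar


def efficient_unit_costs(a, k):
    """Return c_r = sum_i k_i * a_ir for each output r, as in (3)."""
    n_outputs = len(a[0])
    return [sum(k[i] * a[i][r] for i in range(len(k))) for r in range(n_outputs)]


costs = efficient_unit_costs(A, K)
print([round(c, 2) for c in costs])  # [130.0, 170.0, 500.0]
```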

We now turn to Exhibit 2 which reflects the composition of inefficient and
efficient hospitals included in our data base. Actual inputs per unit output
used by each hospital are listed in Exhibit 2, columns 9-16, with inefficient
input levels per unit of output noted by circled values. Column 17 reflects
the actual vacancy rate (% of unused bed days available during the year). An
efficient hospital is expected to have a 5% vacancy rate.

We  develop  the  actual  inputs  used  for  each  hospital  by  selecting  an 
arbitrary set of outputs: Teaching units per year are reflected in col. 6,
regular patients treated during the year are in col. 7, and severe patients
treated during the year are in col. 8. Other ways of summarizing patient care
outputs are included in columns 4 and 5. Column 4 reflects total patients
which is the sum of col. 7 and col. 8. Column 5 reflects the percentage (%)
of severe patients treated which is based on (col. 8) ÷ (col. 4) x (100).
We  develop  this  percentage  output  measure  because  it  reflects  output  data  in 
the  form  which  is  most  readily  accessible  in  many  real  data  sets  and  it  will 
allow  us  to  consider  how  such  data  may  be  used  in  our  DEA  efficiency 
evaluation  of  these  DMU's.  The  inputs  used  by  each  hospital  to  produce  the 
outputs  in  columns  6,  7,  and  8  are  reflected  in  columns  1,  2,  and  3.  Column  1 
contains  the  full  time  equivalents  (FTE's)  of  labor  years  used  by  the  hospital 
during  one  year.  Column  2  has  the  bed  days  available  per  year  to  treat 
patients.  Column  3  gives  the  supply  dollars  used  per  year.  We  clarify  how 
these  input  levels  were  calculated  by  example  in  Exhibit  3. 

Exhibit 3 illustrates how the data for H1, an efficient DMU, and H15, an
inefficient DMU, were constructed. H1 is efficient and therefore used the same
inputs per unit of output as the structural model in Exhibit 1. During the
year, H1 provides care for 3000 regular patients, 2000 severe patients, and 50
training units of service. It, therefore, utilized (.004)(3000) +
(.005)(2000) + (.03)(50) = 23.5 FTE's in that year. H15 produced the same
outputs as H1 but was inefficient in its use of certain inputs. It used .005
FTE's/regular patient, while it adhered to the structural model FTE usage
rates for severe patients (.005 FTE's/patient) and training (.03
FTE's/training unit). H15 therefore used (.005)(3000) + (.005)(2000) +
(.03)(50) = 26.5 FTE's/year to produce the same outputs. Similarly, H15 is
inefficient in the number of bed days used and supply dollars used per regular
patient and is efficient in the amount of bed days and supply dollars
consumed for severe patients and for supply dollars used for teaching
outputs. Bed day, FTE, and supply dollar inputs are also calculated in
Exhibit 3 to further illustrate the way the data base was constructed.

The  number  of  FTE's,  bed-days,  and  supply  dollars  inputs  were  calculated 
as  illustrated  in  Exhibit  3  for  each  hospital  based  on  the  arbitrarily 
assigned  output  mix  of  regular  patients,  severe  patients  and  training  units 
and  the  actual  efficient  or  inefficient  input  per  unit  output  rate  reflected 
in  Exhibit  2. 
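
As a minimal sketch of this construction, the FTE figures quoted for H1 and H15 can be reproduced by applying (1) to a single input row. The function name and the list layout are illustrative choices, not part of the exhibits; only the FTE input is shown because its rates are the ones stated explicitly in the text.

```python
def inputs_used(coeffs, outputs):
    """x_i = sum_r a_ir * y_r for a single hospital, as in (1)."""
    return [sum(a * y for a, y in zip(row, outputs)) for row in coeffs]


# Outputs shared by H1 and H15: regular patients, severe patients, training units.
outputs = [3000, 2000, 50]

fte_h1 = [[0.004, 0.005, 0.03]]    # structural-model FTE rates (efficient)
fte_h15 = [[0.005, 0.005, 0.03]]   # excess FTE's per regular patient only

h1_fte = inputs_used(fte_h1, outputs)[0]
h15_fte = inputs_used(fte_h15, outputs)[0]
print(round(h1_fte, 2), round(h15_fte, 2))  # 23.5 26.5
```

The 3.0 FTE difference is the excess input that an efficiency measurement technique would need to detect from the input and output totals alone.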

Certain relationships posited in the structural model are generally not
known, like the actual amount of staff time and supplies that are required to
support each intern or resident at a hospital. We nevertheless explicitly
introduce these relationships in the simulation to determine if the efficiency
measurement techniques we will apply can uncover them using only the resulting
input and output data. Before proceeding, it should be noted that when the
underlying structural model is known, which DMU's are inefficient can be
determined directly, and techniques such as we will be considering would be
unnecessary for purposes of efficiency evaluation.

3.  The DEA Model

The  Charnes,  Cooper  and  Rhodes  (CCR)  [9]  [10]  data  envelopment  analysis 
(DEA)  technique  will  be  applied  to  the  artificial  data  set  using  the  following 
formulation  which  is  a  linear  programming  format  of  the  fractional  programming 
form  of  DEA  described  in  CCR  [9]. 

Objective:

$$\max\; h_o = \sum_{r=1}^{3} u_r\, y_{ro}$$

where o is the DMU being evaluated in the set of j = 1, ..., 15 DMU's.

Subject to:

$$0 \ge \sum_{r=1}^{3} u_r\, y_{rj} - \sum_{i=1}^{3} v_i\, x_{ij}, \qquad j = 1, \ldots, 15$$

$$1 = \sum_{i=1}^{3} v_i\, x_{io} \qquad (6)$$

$$0 < u_r,\, v_i, \qquad r = 1, 2, 3; \quad i = 1, 2, 3$$

Data:  Outputs: $y_{rj}$ = observed amount of $r$th output for $j$th DMU
       Inputs:  $x_{ij}$ = observed amount of $i$th input for $j$th DMU

This application of linear programming is designed for an ex post
evaluation of how efficient each DMU was in its use of inputs ($x_i$) to
produce outputs ($y_r$) without explicit knowledge of the input-output
relationship it used. The weights in the form of $u_r$, $v_i$ are also not
known or given a priori. They are, instead, calculated as ($u_r$, $v_i$)
values to be assigned to each output and input in order to maximize the
$h_o^*$ value for the DMU being evaluated (DMU$_o$).
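
To make the role of the weights concrete, the sketch below evaluates one candidate set of weights against the constraints of (6). It does not solve the linear program; it only scales the weights so that the weighted inputs of the evaluated DMU equal 1 and then checks feasibility. The function name, the tolerance, and the three-DMU data set are hypothetical and are not taken from Exhibit 2.

```python
def evaluate_weights(u, v, Y, X, o, tol=1e-9):
    """Scale (u, v) so that sum_i v_i * x_io = 1, check the constraints of (6),
    and return (h_o, feasible) for the DMU with index o."""
    scale = sum(vi * xi for vi, xi in zip(v, X[o]))
    u = [ur / scale for ur in u]
    v = [vi / scale for vi in v]
    feasible = all(w > 0 for w in u + v)          # positivity of all weights
    for y_j, x_j in zip(Y, X):
        weighted_out = sum(ur * yr for ur, yr in zip(u, y_j))
        weighted_in = sum(vi * xi for vi, xi in zip(v, x_j))
        if weighted_out - weighted_in > tol:      # no DMU may be rated above 1
            feasible = False
    h_o = sum(ur * yr for ur, yr in zip(u, Y[o]))
    return h_o, feasible


# Hypothetical data: three DMU's, two outputs, two inputs.
Y = [[10.0, 5.0], [6.0, 8.0], [10.0, 5.0]]
X = [[4.0, 2.0], [4.0, 2.0], [5.0, 2.5]]   # DMU 2 uses 25% more of each input than DMU 0

h, ok = evaluate_weights(u=[0.2, 0.2], v=[0.5, 0.5], Y=Y, X=X, o=2)
print(round(h, 4), ok)  # 0.8 True
```

Because DMU 2 produces the same outputs as DMU 0 with 25% more of each input, no feasible weights can rate it above 0.8, so the candidate weights shown happen to be optimal for this toy case. In an actual application the maximizing weights would be found by a linear programming routine.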

Particular attention might be called to the positivity conditions on the
variables, which CCR ensure by introducing the conditions

$$\varepsilon < u_r, \quad \varepsilon < v_i \quad \text{for all } r \text{ and } i \qquad (7)$$

where $\varepsilon > 0$ is a small constant which is so small that it cannot
otherwise disturb any solution involving only real numbers. We shall use the
value $\varepsilon = .001$ in this discussion for numerical convenience.
Although still smaller values may be used, a series of checks need to be made
in any case to ensure that the numerical value assigned to $\varepsilon$ does
not alter the analysis and conclusions.⁵

With the introduction of the $\varepsilon$ value, the efficiency rating (E)
of DMU$_o$ may be represented as

$$E_o = h_o^* - \varepsilon \left[ \sum_{r=1}^{3} s_r^{-*} + \sum_{i=1}^{3} s_i^{+*} \right] \qquad (8)$$

where $s_r^{-*}$ and $s_i^{+*}$ represent the negative and positive slack
corresponding to outputs and inputs in the optimal linear programming solution
related to DMU$_o$. Thus, applying DEA to a set of DMU's results in an
efficiency rating for each DMU of E = 1 indicating it is relatively efficient
or E < 1 indicating it is relatively inefficient.⁶

The following analysis proceeds with the above interpretation of the
efficiency rating (E). In the subsequent sections, the implication of this E
value and the constraint that $\varepsilon > 0$ are reconsidered in the
context of further interpretations of the DEA results.

4.  Results of Alternative Efficiency Measurement Techniques: DEA, Ratio
Analysis, Basic Regression Analysis


DEA,  simple  forms  of  regression  analysis,  and  ratio  analysis  are  applied 
to  the  artificial  data  set  in  Exhibit  2  to  consider  how  each  methodology 
performs  in  locating  inefficient  units. 

Table 1

                      DEA Efficiency     Efficiency
                      Rating (E)         Reference Set

Efficient DMU's
H1                    1.0
H2                    1.0
H3                    1.0
H4                    1.0
H5                    1.0
H6                    1.0
H7                    1.0

Inefficient DMU's
H8                    0.99               H4
H9                    0.98               H2
H10                   1.0
H11                   0.85               H1, H4, H6
H12                   0.99               H3, H4
H13                   1.0
H14                   0.99               H1, H4, H6
H15                   0.87               H4, H6, H7

Application of DEA to Artificial Data Base.

DEA was applied to the 15 hospitals (DMU's) in the artificial data base
developed in Exhibit 2 using the three outputs (severe patients, regular
patients and teaching outputs) and three inputs (full time equivalents, bed
days available, and supply dollars). The DEA efficiency rating is reported in
Table 1.

DEA has accurately identified six of the eight inefficient DMU's (E < 1)
as reported in Table 1. Two inefficient DMU's, H10 and H13, are rated as
efficient (E = 1) along with the seven efficient DMU's. The six DMU's that
are rated as inefficient have distinct inefficiencies present which are
calculated by DEA by comparison with certain efficient units that comprise an
efficiency reference set for the inefficient DMU (see Table 1). For example,
H8 was found to be inefficient by direct comparison with H4, and H15 is being
compared directly with H4, H6, and H7. DEA isolates the efficiency reference
set en route to seeking the highest efficiency rating possible among the
observational data subject to the constraint that no DMU can have an
efficiency rating greater than 1.0. Hence, the efficiency rating and
efficiency reference set are objectively determined via DEA. We defer further
interpretation of the DEA information about inefficient units until Section 5
and now consider the merits of DEA as an identification tool.

Two  of  the  inefficient  DMU's  were  not  so  identified.  This  reflects  a 
characteristic  that  may  arise  in  any  DEA  study.  A  DMU  will  be  located  as 
inefficient  only  if  it  is  found  to  be  relatively  inefficient  compared  to  other 
DMU's  in  the  data  set.  Thus,  DEA  was  not  able  to  locate  other  DMU's  against 
which  H10  and  H13  are  found  to  be  inefficient.  Another  way  of  describing  this 
result  is  that  efficient  units  are  not  necessarily  efficient  in  an  absolute 
sense.  Indeed,  some  data  sets  may  have  no  absolutely  efficient  units 
present.  This  may  be  viewed  as  a  weakness  of  DEA. 


The primary strength of DEA is that those DMU's which are rated as
inefficient are relatively inefficient compared to other DMU's in the data set
and the inefficiencies present can be identified by comparing each inefficient
DMU with its efficiency reference set as we do in Section 5. Hence, DEA is
reliable with respect to the inefficient units it locates but may not locate
all the inefficient units present.
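
The classification result can be tallied directly from the ratings in Table 1. The short sketch below compares the units DEA rates below 1 with the units known to be inefficient by construction (H8 through H15); the variable names are illustrative.

```python
# DEA efficiency ratings transcribed from Table 1.
ratings = {
    "H1": 1.0, "H2": 1.0, "H3": 1.0, "H4": 1.0, "H5": 1.0, "H6": 1.0, "H7": 1.0,
    "H8": 0.99, "H9": 0.98, "H10": 1.0, "H11": 0.85,
    "H12": 0.99, "H13": 1.0, "H14": 0.99, "H15": 0.87,
}
truly_inefficient = {f"H{j}" for j in range(8, 16)}   # known from the simulation design

flagged = {h for h, e in ratings.items() if e < 1.0}
missed = sorted(truly_inefficient - flagged, key=lambda h: int(h[1:]))
false_alarms = sorted(flagged - truly_inefficient, key=lambda h: int(h[1:]))

print(len(flagged), missed, false_alarms)  # 6 ['H10', 'H13'] []
```

No efficient hospital is flagged, which is the sense in which DEA is reliable for the units it does locate, while H10 and H13 escape detection.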
Ratio  analysis  applied  to  the  artificial  data. 

We now consider how a manager might determine which DMU's are more and
less efficient using ratio analysis, a widely used form of analysis to
evaluate financial and operating performance. In this example, all the inputs
are jointly used by these DMU's to produce three outputs. A number of
different ratios might be developed to evaluate different sets of
relationships such as FTE's/patient, FTE's/severe patient, FTE's/regular
patient, FTE's/teaching output, bed days/patient, bed days/severe patient,
etc. Such a set of ratios does not explicitly recognize the joint use of these
inputs to produce these various outputs. In addition, for the set of ratios
calculated, a DMU may be among the highest (least efficient) for certain
ratios and lowest (most efficient) for other ratios. This leads to some
ambiguity as to whether that DMU is efficient or inefficient and calls for
some method of weighting or ordering the importance of the ratios to gain some
overall assessment of efficiency as was generated using DEA in Table 1.

Rather  than  directly  address  this  issue,  we  will  focus  on  a  type  of  unit 
costing  ratio  analysis  that  is  typically  applied  to  hospitals  and  other 
organizations  to  determine  how  well  it  performs  in  evaluating  these  DMU's.  By 
design,  all  15  hospitals  (DMU's)  paid  the  same  price  per  unit  for  each  type  of 
input.  Hence  we  can  combine  the  inputs  into  dollar  units  without  the 
confounding effect of differing input costs. Rather than deal with all these
outputs, the teaching output might be viewed as a by-product or secondary
output and the patients might be viewed as a single output rather than
segregated into different categories of severity. This simplifying
procedure is not defensible from a cost accounting standpoint. Nevertheless,
in the absence of any other way of combining and weighting the outputs,
similar approaches have been used for hospitals as well as other types of
DMU's (see for example [20]).

Table 2

Single Output Measures

                                                      Case Mix Adjusted Average Cost per
                                  Case Mix            Patient, Segregated into High and
                                  Adjusted            Low Levels of Teaching Outputs
                Average Cost      Average Cost
                per Patient       per Patient         Low*              High*
Hospital        (A)               (B)                 (C)               (D)

Efficient Units
H1              $155.10  (2)      $138.48  (4)        $138.48  (4)
H2               163.32  (5)       138.40  (3)         138.40  (3)
H3               168.32  (7)       142.65  (8)                           $142.65  (3)
H4               160.10  (4)       142.94  (9)         142.94  (2)
H5               158.38  (3)       137.73  (2)                            137.73  (2)
H6               170.15  (9)       140.12  (5)         140.12  (1)
H7               142.60  (1)       135.81  (1)                            135.81  (1)

Inefficient Units
H8               176.95  (11)      157.99  (12)**                         157.99  (6)
H9               168.32  (7)       142.64  (7)         142.64  (6)
H10              169.69  (8)       161.61  (14)**                         161.61  (7)
H11              170.33  (10)      153.10  (10)        153.10  (7)
H12              178.33  (12)      155.07  (11)                           155.07  (5)
H13              165.68  (6)       142.00  (6)         142.00  (5)
H14              178.33  (12)      155.07  (11)                           155.07  (5)
H15              179.74  (13)      160.48  (13)**      160.48  (8)

Mean             167.02            146.94              144.77            149.42
Standard Deviation                   8.82                7.36              9.66

*  Low teaching outputs were 50 units and high teaching outputs were 100 units
   as per Exhibit 3, Col. 6.

** Hospitals more than one standard deviation over average cost.

Table 2 column (A) reflects the average cost per patient for each DMU.
This results in a ranking of hospitals reflected by the rank order in
parentheses directly to the right of the average cost figure in Table 2. The
lowest cost (most efficient) DMU is ranked 1 and the highest cost (least
efficient) DMU is ranked 13. This ranking erroneously classifies H13 (rank 6)
as more efficient than H3 (rank 7) and H6 (rank 9) and it classifies H9 as
more efficient than H6. In addition, there is no objective means for
determining the cutoff cost level to segregate efficient and inefficient
units.
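
The column (A) ranking can be reproduced from the cost figures alone. The sketch below assumes that tied hospitals share a rank and that the next distinct cost takes the next integer, a convention inferred from the highest rank of 13 among 15 hospitals; it then lists every pair in which an inefficient hospital is ranked ahead of an efficient one.

```python
# Average cost per patient, column (A) of Table 2.
avg_cost = {
    "H1": 155.10, "H2": 163.32, "H3": 168.32, "H4": 160.10, "H5": 158.38,
    "H6": 170.15, "H7": 142.60, "H8": 176.95, "H9": 168.32, "H10": 169.69,
    "H11": 170.33, "H12": 178.33, "H13": 165.68, "H14": 178.33, "H15": 179.74,
}
efficient = [f"H{j}" for j in range(1, 8)]
inefficient = [f"H{j}" for j in range(8, 16)]


def dense_rank(costs):
    """Rank 1 = lowest cost; ties share a rank (assumed convention)."""
    levels = sorted(set(costs.values()))
    return {h: levels.index(c) + 1 for h, c in costs.items()}


rank = dense_rank(avg_cost)
misranked = [(bad, good) for bad in inefficient for good in efficient
             if rank[bad] < rank[good]]
print(rank["H13"], rank["H3"], rank["H6"], rank["H15"])  # 6 7 9 13
print(misranked)
```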

If the efficient relative costs of certain outputs are known, the outputs
can be weighted to reflect a cost per weighted output unit. In this case we
know the efficient cost of a regular patient ($130) and a severe patient
($170) and the patient units can therefore be weighted to value each severe
patient as the equivalent of 170/130 ≈ 1.3 regular patients. For example,
H1 would have adjusted patient output units of 3000 regular patients + 2000 x
1.3 severe patients for an adjusted total of 5600 patients. This results in
an adjusted average cost per patient of $138.48 compared with the unadjusted
cost of $155.10 for H1.
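
As a check on this adjustment, the following sketch recomputes the case mix adjusted figure for H1 from its unadjusted average cost and its patient counts. The function name is illustrative; the weight of 1.3 is the rounded ratio used in the text.

```python
def case_mix_adjusted_cost(avg_cost, regular, severe, severe_weight):
    """Average cost per weighted patient, counting each severe patient as
    `severe_weight` regular patients."""
    total_cost = avg_cost * (regular + severe)
    adjusted_patients = regular + severe_weight * severe
    return total_cost / adjusted_patients


h1_adjusted = case_mix_adjusted_cost(155.10, regular=3000, severe=2000,
                                     severe_weight=1.3)
print(round(h1_adjusted, 2))  # 138.48
```

The result agrees with the column (B) entry for H1 in Table 2.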

The adjusted cost per patient is reflected in column (B) of Table 2 with
the new ranking in parentheses immediately to the right of the average cost
per patient. Again, we have a misranking with inefficient DMU's H9 and H13
being ranked as more efficient than H3 and H4.

If we further segregate the 15 DMU's by the third output (teaching) and
separate  them  based  on  those  with  high  (100  units)  versus  low  (50  units) 
teaching  outputs,  the  ranking  based  on  unit  costs  is  reflected  in  columns  C 
and  D  in  Table  2.  At  this  point,  we  have  achieved  an  accurate  ranking,  but 
again  we  have  no  objective  method  of  separating  efficient  from  inefficient 
DMU's. 

The  problem  of  locating  a  point  beyond  which  DMU's  are  considered 
inefficient  is  typically  addressed  by  establishing  a  subjective  cutoff  point, 
but  there  is  no  assurance  that  the  inefficient  units  will  be  accurately 
located  through  this  process.  For  example,  if  the  cutoff  was  set  at  one 
standard  deviation  above  the  mean  adjusted  cost  per  patient,  only  3  DMU's  (H8, 
H10  and  H15)  would  be  identified  as  inefficient  as  indicated  in  column  (B)  of 
Table  2. 
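
The cutoff rule described here can be sketched as follows, using the column (B) figures from Table 2. Using the population standard deviation is an assumption on our part; it reproduces the 8.82 reported in the table.

```python
from statistics import mean, pstdev

# Case mix adjusted average cost per patient, column (B) of Table 2.
adjusted_cost = {
    "H1": 138.48, "H2": 138.40, "H3": 142.65, "H4": 142.94, "H5": 137.73,
    "H6": 140.12, "H7": 135.81, "H8": 157.99, "H9": 142.64, "H10": 161.61,
    "H11": 153.10, "H12": 155.07, "H13": 142.00, "H14": 155.07, "H15": 160.48,
}

mu = mean(adjusted_cost.values())
sigma = pstdev(adjusted_cost.values())   # population form (assumed)
cutoff = mu + sigma

flagged = [h for h, c in adjusted_cost.items() if c > cutoff]
print(round(mu, 2), round(sigma, 2), flagged)  # 146.94 8.82 ['H8', 'H10', 'H15']
```

Five of the eight inefficient hospitals fall below this cutoff, which illustrates how sensitive the classification is to a subjectively chosen threshold.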

A  unit  cost  analysis  of  the  type  completed  above  could  not  have  been 
completed  if  the  efficient  cost  or  efficient  relative  cost  of  various  outputs 
were  not  known.  Such  information  is  frequently  absent  in  non-profit  settings 
such  as  health  and  education.   In  addition,  if  the  cost  of  inputs  varied  among 
the  DMU's,  this  technical  efficiency  analysis  would  be  confounded  by  the  DMU's 
purchasing  behavior  which  might  lead  to  other  inaccuracies  in  use  of  such 
ratios. Thus, ratio analysis applied in this unit costing manner required
additional information about the production and cost relationships beyond that
required for DEA and still provided a less objective method of identifying the
inefficient DMU's than that achieved using DEA, since the cutoff between
efficient and inefficient is not provided by the ratio analysis.

Regression analysis applied to the artificial data base.

In industries where the efficient input-output relationship, i.e., the
technology, is not known with any precision, regression analysis has been used
to gain insights into the production relationships that exist among the
observed data. Nevertheless, it is not clear how traditional regression
analysis can be used to evaluate the efficiency of individual DMU's. The
primary problem is that the regression estimates of production and cost
relationships are based on least squares estimates, which provide a mean or
central tendency set of relationships and therefore reflect a mixture of
efficient and inefficient behavior in the data set. Thus, regression
relationships will only reflect efficient relationships if all units in the
study are themselves efficient. In a non-profit setting, such as education,
health, and government, where the lack of competition in an economic sense
makes it difficult to justify the assumption of efficient behavior, the
resulting least squares regression may not reflect efficient relationships.⁷

We  now  consider  the  extent  to  which  regression  analysis  can  be  used  to 
identify  the  inefficient  units  in  the  artificial  data  set.  We  again  take 
advantage  of  the  knowledge  that  all  DMU's  pay  the  same  prices  for  all  inputs 
and  attempt  to  develop  an  estimate  of  the  total  cost  as  a  function  of  the 
outputs  produced  by  each  DMU. 

We consider a simple additive (linear) regression model and a multiplicative
(nonlinear) model to determine what insights are available about the
production function and about the efficiency of the 15 DMU's.

Additive Regression Model

Total cost was estimated as a function of the quantity of three outputs
produced by each DMU. The results were as follows:

$$C = -95{,}300 + \underset{(8)}{152}\,y_1 + \underset{(22.2)}{182.4}\,y_2 + \underset{(767)}{1302}\,y_3 \tag{9}$$

where  C  = Total cost per year
       y1 = # of regular patients treated per year
       y2 = # of severe patients treated per year
       y3 = # of training units provided in one year.

The high R² value of .97 suggests a good fit with the observational
data, so by standard reasoning, a high degree of cost variation is explained
by these independent variables. The expression yields a fixed negative cost
estimate of $95,300, although no such costs appear in the underlying model.
The estimated incremental cost of these outputs versus the actual or
efficient cost of the outputs is summarized below:


              Estimated          Actual (efficient)
Output        incremental cost   incremental cost

y1            $152               $130
y2            $182.40            $170
y3            $1302              $500

The incremental costs differ from the true costs and, indeed, the
estimated cost for y3 is very wide of the mark.

The expression (9) represents a widely used form of the cost-via-regression
approach, subject only to evaluation of the economic reasonableness of the
relationships and standard tests of significance and independence of right
hand side variables.⁸ Although the estimates may differ from the actual
marginal costs, such a regression may be useful for other purposes, e.g., for
pure prediction due to the high R². The use of the coefficients in (9) to
estimate marginal costs is dubious and so would be its use in evaluating the
efficiency of individual hospitals. One approach which might be employed is
to designate inefficient DMU's as those for which actual total cost is some
arbitrarily determined distance above the estimated total cost. In this
example, inefficient DMU's H8 and H13 have actual costs which are near or below
the estimated costs and would not be considered inefficient based on this rule.
All the efficient DMU's have actual costs below the regression estimate and
the other six inefficient DMU's actual costs are above the regression
estimate. Hence, in this case, the use of this regression estimate with the
identification rule described above as a basis for locating inefficient units
appears to be more accurate than ratio analysis, and as accurate as DEA, in
that it located six inefficient DMU's. Again, the rule to segregate efficient
and inefficient units that yielded this result was arbitrarily established.

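The identification rule just described can be written out as a short sketch.
The coefficients are those reported for expression (9); the DMU used in the
example and the `margin` parameter (the "arbitrarily determined distance") are
hypothetical and are not taken from the artificial data set.

```python
# Coefficients of the additive model (9) as reported in the text.
INTERCEPT = -95_300.0
COEF = (152.0, 182.4, 1302.0)  # regular patients, severe patients, training units

def estimated_cost(y1, y2, y3):
    """Total cost predicted by the additive regression (9)."""
    return INTERCEPT + COEF[0] * y1 + COEF[1] * y2 + COEF[2] * y3

def flag_inefficient(actual_cost, y1, y2, y3, margin=0.0):
    """Rule from the text: inefficient if actual cost exceeds the estimate by
    more than `margin` (margin is an illustrative, arbitrary parameter)."""
    return actual_cost > estimated_cost(y1, y2, y3) + margin

# Hypothetical DMU: 3000 regular patients, 2000 severe patients, 50 training units.
est = estimated_cost(3000, 2000, 50)
print(round(est), flag_inefficient(800_000, 3000, 2000, 50))
```

Changing `margin` changes which units are flagged, which is the sense in which
the rule is arbitrary.
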
Nonlinear  regression  model 

A  nonlinear  regression  model  might  be  used  instead  of  the  additive  model 
because  it  allows  for  the  possibility  of  returns  to  scale.  The  use  of  a 
multiplicative  regression  model  was  applied  to  the  15  DMU's  with  the  following 
results:

$$\ln C = 3.98 + \underset{(.04)}{.62}\,\ln y_1 + \underset{(.07)}{.57}\,\ln y_2 + \underset{(.05)}{.10}\,\ln y_3$$

or

$$C = 53.79\; y_1^{.62}\; y_2^{.57}\; y_3^{.10} \tag{10}$$

In (10), the sum of the coefficients (.62 + .57 + .10) exceeds 1, which suggests
the presence of decreasing returns to scale, i.e., doubling each output will
more than double the total cost. In addition, there are partial scale
economies for increases in each output individually holding other outputs
constant. We know that the underlying cost function is linear in these
outputs so that neither of these effects is present. Hence, in spite of the
high R² value of .96, this nonlinear cost function also does not mirror the
true relationship, and conclusions about efficient cost behavior drawn from
application of regression techniques to a set of efficient and inefficient
DMU's can be misleading.
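
The scale effects implied by the multiplicative model can be checked with a
small calculation. The sketch below uses only the exponents reported for (10);
the function name is illustrative.

```python
# Exponents of the multiplicative cost model (10) as reported in the text.
EXPONENTS = {"y1": 0.62, "y2": 0.57, "y3": 0.10}

def cost_multiplier(scale, outputs=None):
    """Factor by which fitted cost changes when the named outputs are
    multiplied by `scale` (all outputs if `outputs` is None)."""
    names = EXPONENTS if outputs is None else outputs
    return scale ** sum(EXPONENTS[n] for n in names)

all_doubled = cost_multiplier(2.0)       # 2 ** (.62 + .57 + .10)
only_y1 = cost_multiplier(2.0, ["y1"])   # 2 ** .62
print(round(all_doubled, 3), round(only_y1, 3))
```

Doubling all three outputs multiplies the fitted cost by roughly 2.4, while
doubling y1 alone multiplies it by roughly 1.5. A cost function that is truly
linear in the outputs would show neither effect.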

If we were to arbitrarily consider DMU's for which actual total cost
exceeds the estimated total cost in (10) as the potentially inefficient units,
then efficient DMU's H2, H6, and H7 would be erroneously considered
inefficient and inefficient DMU's H11, H12, H13, and H14 would be identified
as efficient. Hence, in this example the result of the nonlinear regression
would be inferior to the additive regression results, the ratio analysis and
DEA results. Although the nonlinear regression results are less meaningful
than the linear regression, it is not clear that we could objectively reject
the nonlinear results in favor of the linear regression results, since both
have high R² values and we are attempting to locate inefficient units
without knowing a priori that the underlying production and cost
relationships are linear. Hence, our ability to identify inefficient DMU's
will be influenced by the regression model we select. Other more
sophisticated regression techniques which also reflect mean relationships
rather than the extremal efficient relationships also prove inadequate for
purposes of effecting separation between efficient and inefficient units.⁹
Table 3

Comparison of DEA, ratio analysis, and linear regression approaches'
ability to locate inefficient DMU's

E = DMU rated as efficient
I = DMU rated as inefficient

                     (A)           (B)            (C)           (D)
                                                  Regression Analysis
                     DEA           Ratio          Linear        Nonlinear
                     Results (1)   Analysis (2)   Model (3)     Model (3)

Efficient DMU's
  H1                 E             E              E             E
  H2                 E             E              E             I
  H3                 E             E              E             E
  H4                 E             E              E             E
  H5                 E             E              E             E
  H6                 E             E              E             I
  H7                 E             E              E             I

Inefficient DMU's
  H8                 I             I              E             I
  H9                 I             E              I             I
  H10                E             I              I             I
  H11                I             E              I             E
  H12                I             E              I             E
  H13                E             E              E             E
  H14                I             E              I             E
  H15                I             I              I             I

(1) From table 1

(2) From table 2 column B - DMU's with cost/patient greater than one standard
deviation above the mean.

(3) Based on rule that DMU's with actual total cost greater than estimated
total cost (based on the regression model) are inefficient

Table 3 summarizes the results of the three methodologies in locating
inefficient units. We found that DEA located six of the eight inefficient
DMU's, as did the linear regression model under the rule that DMU's with actual
costs above the regression estimate are deemed inefficient (columns A and C
in table 3). Ratio analysis located only 3 of the inefficient DMU's (col. B
of table 3). Finally, the nonlinear regression approach proved far less
reliable (see table 3, col. D) in that 3 efficient DMU's were identified as
inefficient.
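
The counts quoted above can be reproduced by tallying the ratings in Table 3.
The sketch below transcribes the table as four-letter strings in the column
order (A) DEA, (B) ratio, (C) linear, (D) nonlinear; the variable and function
names are illustrative.

```python
# Ratings transcribed from Table 3, one letter per column (A), (B), (C), (D).
METHODS = ("DEA", "ratio", "linear", "nonlinear")
EFFICIENT = {
    "H1": "EEEE", "H2": "EEEI", "H3": "EEEE", "H4": "EEEE",
    "H5": "EEEE", "H6": "EEEI", "H7": "EEEI",
}
INEFFICIENT = {
    "H8": "IIEI", "H9": "IEII", "H10": "EIII", "H11": "IEIE",
    "H12": "IEIE", "H13": "EEEE", "H14": "IEIE", "H15": "IIII",
}

def tally(ratings, letter):
    """Count, per method, how many DMUs in `ratings` received `letter`."""
    return {m: sum(r[k] == letter for r in ratings.values())
            for k, m in enumerate(METHODS)}

located = tally(INEFFICIENT, "I")     # inefficient DMUs correctly flagged
false_alarms = tally(EFFICIENT, "I")  # efficient DMUs wrongly flagged
print(located, false_alarms)
```

The tally gives six inefficient DMU's located by DEA and by the linear model,
three by ratio analysis and four by the nonlinear model, with the nonlinear
model alone misclassifying three efficient DMU's.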

Based  on  these  results,  DEA  analysis  may  be  argued  to  be  the  more 
objective  and  reliable  in  locating  inefficient  DMU's  for  the  following  reasons. 

1.  Ratio  analysis  and  regression  analysis  required  an  arbitrary  rule  to 
determine  which  DMU's  would  be  designated  as  inefficient.  With  ratio 
analysis,  the  mean  might  well  have  been  lower  or  higher  depending  on 
whether  there  were  more  or  less  efficient  units  in  the  data  set. 
Similarly,  regression  analysis  might  also  have  a  lower  or  higher  cost 
curve  depending  on  the  relative  number  of  inefficient  units. 

2.  Ratio  analysis  required  added  relative  price  data  and  other  iterations 
to  address  the  multiple  output  and  input  situation  while  DEA  could 
address  this  directly  using  only  physical  output  and  input  data.  In 
addition, the ratios could easily be confounded if DMU's paid
different prices for similar inputs. For example, a DMU that had
very  low  prices  might  have  a  lower  average  cost  which  obfuscates  the 
presence  of  technical  inefficiency  that  could  lead  to  further 
reductions  in  inputs  and  costs.  Regression  analysis  also  assumed 
DMU's  had  the  same  costs/input,  as  different  cost  structures  would 
have  shifted  the  cost  function  and  produced  different  but  not 
necessarily  more  accurate  results. 

3.  Regression  analysis  results  depended  on  the  selection  of  an 
appropriate  model  or  set  of  cost  relationships.  In  both  the  linear 
and  nonlinear  case,  cost  relationships  were  misleading  and  in  the  use 
of  the  nonlinear  model,  the  identification  of  inefficient  DMU's  was 
the  least  accurate  of  all  these  approaches. 

4.  The  added  information  about  the  nature  of  inefficiencies  with  ratio 
analysis  was  negligible  and  with  regression  analysis  it  was  misleading. 

In the following section, we consider the extent to which DEA results
provide insights into the nature of the inefficiencies located as well as
consider other limitations and strengths of DEA that need to be considered in
assessing DEA versus ratio analysis and regression analysis for identifying
inefficient DMU's.

4.   Interpretation of DEA results

The application of DEA to this artificial data set highlights two key
areas that need to be considered by a manager using DEA: 1) data
specification and 2) interpretation of the efficiency ratings.

Data Specification

The  DEA  results  reliably  located  six  of  the  eight  inefficient  DMU's  in 
this  application  partly  because  the  data  were  specified  and  measured  in  a 
manner  consistent  with  the  way  the  structural  model  was  developed.  That  is, 
data  in  the  form  of  physical  inputs  and  outputs  were  incorporated  in  the  DEA 
efficiency evaluation. In the absence of knowledge of the true structural
model, it may be possible to specify input and output data in more than one
reasonable and relevant way, but such specifications may not be consistent
with the true production function and may result in somewhat less reliable DEA
results.

For example, if the output measure used in the simulation was patient days
instead of number of patients, the DEA results would identify DMU's that were
inefficient with respect to patient days of output. Using patient days as an
output measure, the DEA results would also identify inefficient DMU H8 as
efficient. This result arises because H8 was not inefficient with respect to
how many patient days of output it provided. H8 was only inefficient with
respect to how many days per patient it utilized. Hence, the DEA results
using a patient day output measure would yield an accurate conclusion with
respect to the output measure used, but the result would not necessarily be
accurate with respect to the true production function. This suggests that
alternative output measures need to be considered where there is ambiguity as
to what input and output measures should be used for evaluating the DMU's
under study.

Another related issue is the way data are specified. In the simulated
application, data in the form of physical input and output units were
available. In many applications, data may include a mixture of physical
measures and indices. For example, a breakdown of how many patients were
severe and how many were regular (not severe) may not be available and the
data may only be available in the form of a total number of patients treated
accompanied by an index of the severity of the patients. When DEA was applied
to this data set with outputs specified as total patients, percent severe (an
index), and teaching units (instead of regular patients, severe patients,
and teaching units) the results proved less reliable. Efficient hospitals H5,
H6, and H7 were identified as inefficient along with all of the inefficient
DMU's. This again suggests that a sensitivity analysis to alternative data
specifications is needed before relying on the DEA results, particularly
where data are not specified in the form of physical units of inputs and
outputs.¹⁰

Interpretation of the efficiency rating.

The efficiency rating E used in this application of DEA, defined in (8), is
useful primarily to indicate whether the DMU is inefficient, i.e., E < 1.
The actual value of E will depend on the value assigned to ε in (7). For
example, the efficiency rating for inefficient hospital H11 was .8527 with
ε = .001. This efficiency rating increases to .9853 with ε = .0001 and to
.99853 with ε = .00001.¹¹ While a natural tendency for any researcher
in the social sciences would be to round this last efficiency rating to 1.0,
this rating of .99853 is distinct from 1.0 in this type of analysis and
indicates the presence of inefficiencies in H11. This is illustrated in table
4 where the output of the linear program results of the DEA application is
reported for H11 on the lower half of the table and an interpretation of these
results is presented above that.

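The dependence of E on ε can be reproduced from the figures given in footnote
11. The sketch below assumes, as the footnote's arithmetic does, that the
optimal value h0* = 1.0 and the three slack values for H11 stay the same as ε
is made smaller; the names are illustrative.

```python
# Values taken from footnote 11 for H11.
H_STAR = 1.0
SLACKS = (5.11, 45.71, 96.4)

def efficiency_rating(epsilon, h_star=H_STAR, slacks=SLACKS):
    """E = h0* - epsilon * (sum of slacks), assuming slacks do not vary with epsilon."""
    return h_star - epsilon * sum(slacks)

ratings = {eps: efficiency_rating(eps) for eps in (0.001, 0.0001, 0.00001)}
for eps, e in ratings.items():
    print(eps, round(e, 5))
```

The rating approaches 1 as ε shrinks but remains strictly below 1 so long as
any slack is positive, which is why .99853 should not be rounded to 1.0.
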
The (nonzero) shadow prices related to the DMU constraints in the linear
program formulation in (6) indicate the efficient DMU's against which the
inefficient DMU is being most directly evaluated. We refer to these efficient
DMU's that form the basis for the efficiency rating of inefficient DMU's as
the efficiency reference set and these are listed in table 1 for each
inefficient DMU. In this example, H11 is being compared directly with an
efficiency reference set of DMU's H4 and H7. The shadow prices reflect the
weights assigned by the DEA in determining the relatively efficient point
against which H11 was compared. This is illustrated by columns B, C, and
D in table 4, where the input/output vectors for DMU's H4 and H7 are each
weighted by their respective shadow prices and summed to yield a composite of
these two efficient DMU's. This composite of 2 efficient DMU's is
specifically more efficient than H11. Column E in table 4 indicates that H11
produced 96.4 fewer units of y1 (and the same amount of y2 and y3) than the
composite (col. D) and used 5.1 more units of x1 and 45,710 more units of
x3. Thus, H11 is less efficient than a combination of two other observed
units.

The amount of inefficiency located in table 4 represents a type of
information about the magnitude and possible location of the inefficiency
which distinctly augments the efficiency rating information. The presentation
in table 4 provides one direct and managerially understandable way to
comprehend the magnitude of inefficiencies indicated by the efficiency rating E.
Table 4 is an example of how the inefficiencies located via DEA are distinctly
and objectively located based on direct comparison with other DMU's in the
data set.

The information provided in table 4 must also be qualified with respect to
the degree to which it can be literally interpreted. DEA results in table 4
directly indicate that a combination of two DMU's operating results would
produce a composite DMU that is more efficient than H11. This indicates one
way for H11 to become more efficient. This is not, however, the only
direction that H11 can choose or should choose to improve efficiency. For
example, H11 was inefficient with respect to its use of FTE's and supplies
used to provide severe and regular patient care (see exhibit 2). The
adjustment required to make H11 efficient based on its actual outputs and the
structural model differs from the adjustment suggested in table 4, as is
indicated in table 5:


Table 5

          (A)              (B)                 (C)                (D)
          Actual inputs    Efficient inputs    Adjustment         Adjustment
          and outputs      for H11 based on    required to        indicated from
          of H11           structural model    become efficient   table 4 based
                                               (B - A)            on DEA results

Inputs
  x1      44.5             36.5                -8                 -5.1
  x2      65,260           65,260                                 0
  x3      265,000          200,000             -65,000            -45,710

Outputs
  y1      50                                                      +96.4
  y2      3000                                                    0
  y3      5000                                                    0

Table 5 compares the true adjustment required by the structural model for
H11 to become efficient (column C) with the DEA results (column D). In other
words, column C in table 5 reflects the adjustments required for H11 to
exactly fit the structural model. The two sets of adjustments in columns C
and D are both mathematically accurate ways for H11 to become efficient. Note,
however, that only the solution in column D is available directly from DEA and
that the "true" solution in column C is not available. In addition, it may be
impractical or impossible to make the adjustments suggested by column D in
table 5, i.e., can H11 actually increase teaching outputs to nearly three times
its recent level? This suggests that the manager might use the DEA results to
indicate the presence of inefficiency and possible areas where the
inefficiencies lie. The DEA results suggest alternative paths to improve
efficiency, e.g., in the above case, H11 could either emulate H4 or H7 or it
could aim for the composite input-output level suggested in table 4. The
identification of preferred and attainable paths to improve efficiency would
naturally be based on managerial judgment. Should this lead to proposed paths
that differ from the one derived from DEA, it is also possible to reapply DEA
for a sensitivity analysis to determine if other paths proposed by management
would improve the efficiency compared to the other DMU's in the data set.
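
The two adjustment paths in table 5 can be compared with a short calculation.
The figures below are those listed in table 5; the row labels x1 to x3 and
y1 to y3 are assumed from the surrounding text (with y1 taken to be the
teaching output), and the function name is illustrative.

```python
# H11's actual inputs and outputs from table 5, column A.
# Row labels are assumptions inferred from the text (y1 = teaching units).
ACTUAL = {"x1": 44.5, "x2": 65_260, "x3": 265_000,
          "y1": 50, "y2": 3000, "y3": 5000}
STRUCTURAL = {"x1": -8.0, "x3": -65_000}          # column C: B - A
DEA = {"x1": -5.1, "x3": -45_710, "y1": 96.4}     # column D: from table 4

def apply_adjustment(actual, delta):
    """Add each adjustment to the actual level; unlisted items are unchanged."""
    return {k: v + delta.get(k, 0) for k, v in actual.items()}

structural_target = apply_adjustment(ACTUAL, STRUCTURAL)
dea_target = apply_adjustment(ACTUAL, DEA)
teaching_ratio = dea_target["y1"] / ACTUAL["y1"]
print(structural_target["x1"], round(dea_target["x1"], 1), round(teaching_ratio, 2))
```

Under the structural model H11 reaches efficiency by cutting inputs alone,
whereas the DEA composite calls for smaller input reductions together with a
teaching output close to three times the actual level.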

5.  Conclusion 

The  simulated  application  of  various  efficiency  measurement  techniques  to 
an  artificial  data  set  to  evaluate  DMU's  with  multiple  outputs  and  inputs 
appears  to  support  the  following  conclusions: 

DEA versus simple regression analysis

DEA more objectively located inefficient DMU's than simple regression
techniques, without the need to collapse the inputs by relative prices or the
need to specify a functional form. This result is not surprising since DEA is
a methodology that was developed as an efficiency evaluation technique,
whereas simple regression analysis reflects mean or central tendency
relationships, which are more appropriate for pure prediction (assuming the
level of inefficiency among the DMU's remains at the level in the
observational data) than for ascertaining efficient input-output relationships.

DEA versus ratio analysis

DEA more objectively located inefficient DMU's than ratio analysis. Ratio
analysis cannot directly evaluate multiple input-multiple output DMU's, as
the ratios are of a single input per single output form. While ratio analysis
is widely used to gain insights into certain operating relationships, it
nevertheless has severe limitations in dealing with the multiple output-
multiple input case, particularly in the absence of an objective weighting
system that can be used to collapse these outputs and inputs into a single
output to single input ratio. After using added information to develop a unit
costing ratio, an arbitrary rule had to be adopted to identify inefficient
DMU's with ratio analysis.

Use  of  combined  techniques  to  evaluate  efficiency 

Efficiency evaluations may be most effectively accomplished by a
combination of methodologies. For example, DEA might be used to select out
certain relatively inefficient DMU's, after which regression analysis could
then be applied to the set of more efficient DMU's to more accurately estimate
the true marginal costs of outputs. At the same time, DEA was capable of
locating relatively inefficient DMU's and their efficiency reference sets,
i.e., relatively efficient DMU's against which the inefficient DMU's were
being most directly compared. This provided certain insights about
alternative paths for improving the efficiency of relatively inefficient DMU's.
The DEA results pointed to various paths, some of which may not be feasible
ways for an inefficient DMU to improve operations. Once DEA has reduced the
DMU's of interest to the set of inefficient DMU's and their efficiency
reference set, analytical techniques like ratio analysis may be helpful at
locating the sources of the inefficiencies and the paths that may be most
attainable or appropriate for that DMU.

Generalizability 

The  above  conclusions  arise  from  simulated  application  of  alternative 
methodologies  to  a  simple  data  set.  The  problems  encountered  with  ratio 
analysis  and  simple  regression  analysis  are  clearly  ones  that  argue  against 
their usage to evaluate efficiency of multiple output-multiple input DMU's. We
have indicated that when DEA locates an inefficient DMU this determination is
objectively verifiable by comparison with other DMU's in the data set (see
table 5). This attribute provides grounds for optimism about DEA's usefulness.
Nevertheless, the types of problems which may arise from misspecification of
the data, e.g., use of indices or use of physical input and output measures
that are inconsistent with the underlying technology, suggest that the
sensitivity to these specifications must be carefully considered before one can
rely on the results for managerial purposes.

Directions  for  further  DEA  research 

DEA has been shown to be mathematically sound and consistent with economic
theory by Charnes, Cooper, and Rhodes [9] [11]. In addition, it has been
found to be reliable at locating inefficient DMU's in this simple application
to an artificial data set. Further examination of the DEA limitations
located in this study is needed and might include research into the following
questions:

a.  How can data specification for DEA applications and sensitivity
analysis using DEA be most effectively accomplished to achieve results
which are most consistent with the underlying technology?

b.  Are there statistical techniques which might enable one to assess
DEA's reliability when data cannot be specified in physical input and
output units and, specifically, how can indices be accommodated in the
DEA analysis?

c.  Can DEA results be used to gain insight into the efficient rates of
input and output substitution and the optimal path to improve
efficiency?

d.  In the data base used in this study, we assumed no substitutability
among inputs and outputs and constant returns to scale. How well would
DEA perform if these assumptions were relaxed?

e.  Are there relationships between the number of inputs and outputs and
the number of observations that would allow one to estimate the degree
to which the inefficient DMU's will escape identification using DEA?

Overall, the results of this study suggest that DEA is a promising tool to
evaluate relative technical efficiency of DMU's with multiple outputs and
inputs where the efficient production function is not specifiable with any
precision. DEA is also being applied to real data sets with apparent
acceptance, and expanded DEA models to more directly deal with economies of
scale are under development.¹² At this point, we feel that further
validation of DEA in field and simulated settings is needed to better
understand the strengths and limitations and to support the expanded use of
DEA by management in other non-profit, public sector, and for profit settings.

FOOTNOTES

We  are  primarily  concerned  with  technical  efficiency  as  distinct  from 
allocative  or  price  efficiency  which  relates  to  the  question  of  whether 
a  DMU  is  using  an  optimal  mix  of  inputs  and  purchasing  them  at  the 
lowest  price.   A  DMU  is  technically  inefficient  if  it  can  produce  more 
outputs  with  the  amount  of  inputs  it  used  or  can  produce  its  level  of 
outputs  with  fewer  inputs  than  were  used.  Hence,  a  technically 
inefficient  unit  can  reduce  inputs  and  attain  the  same  level  of  outputs 
regardless  of  any  other  types  of  efficiencies  or  inefficiencies  that  may 
be  present. 

For  example,  regression  analysis  has  been  applied  to  health  care 
organizations  in  numerous  studies,  e.g.,  Feldstein  [14],  and  Zaretsky 
[23]  and  ratio  analysis  has  been  used  by  health  regulatory  organizations 
to  locate  inefficient  hospitals  as  in  [20]. 

See  Sherman  [22]  for  a  discussion  of  hospital  efficiency  and  the  need 
for  improved  hospital  evaluation  techniques. 

The original form of the fractional linear program developed by CCR [9]
is as follows:

Objective:

$$\max\; h_0 = \frac{\sum_{r=1}^{s} u_r\, y_{r0}}{\sum_{i=1}^{m} v_i\, x_{i0}}$$

Constraints:

Less than unity constraints:

$$1 \ge \frac{\sum_{r=1}^{s} u_r\, y_{rj}}{\sum_{i=1}^{m} v_i\, x_{ij}}; \qquad j = 1, \ldots, n$$

Positivity constraints:

$$0 < u_r;\; r = 1, \ldots, s \qquad\qquad 0 < v_i;\; i = 1, \ldots, m$$

Data:

Outputs:   y_rj = observed amount of rth output for jth hospital
Inputs:    x_ij = observed amount of ith input for jth hospital

CCR in [9] describe the way the form we use in (6) may be derived from the
fractional form.

This methodology builds on concepts developed by Farrell [13] and Carlson
[7].

We performed a series of checks to determine if smaller values of ε
alter the solution. As will be discussed at a later point in the
paper, smaller ε values do not change the interpretation and
conclusions of the DEA analysis. Similar tests must be performed for
any DEA application to assure that the ε value is sufficiently small.

See CCR [9] for further discussion of these slack values and their
interpretation. It is also possible to obtain the same understanding
of which units are inefficient by using ε = 0, which circumvents the
need to assign a small nonzero value to ε. In this case, a DMU with
a value of h0* = 1 is efficient if and only if there is zero slack,
and a DMU is inefficient if h0* = 1 and there is positive slack or if
h0* < 1. We prefer to assign a small value to ε as it enables
one to determine which DMU's are inefficient by reference only to the E
(efficiency rating) value and because it has been found to be more
easily understood and used by managers, who need only use the simple
rule that E = 1 is efficient and E < 1 is inefficient. The
expression (18) is clarified by example in section 5 and in footnote 11.

Similar  problems  exist  in  the  for  profit  corporate  sector,  though 
existence  of  free  market  competition  would  suggest  that  this  is  a  less 
serious  issue  since  firms  may  be  assumed  to  be  moving  toward  more 
efficient  behavior.   Nevertheless,  it  would  be  naive  to  assume  that  in 
any  real  data  set,  all  DMU's  are  efficient  so  there  will  tend  to  be 
problems  in  evaluating  efficiency  based  on  simple  regression  results. 

8

The independent variables were found to have very low correlation
coefficients as follows:

y1, y2 = -.37
y1, y3 = -.03
y2, y3 = -.08

Hence, the problem in the coefficient values is not due to multicollinearity.

9

See for example Sherman [22] where the flexible functional form
translog function is found to lead to misleading results. Other types
of regression techniques which attempt to estimate extremal
relationships that more closely approximate efficient production or
cost functions have been proposed. Forsund, Lovell, and Schmidt [15]
review a variety of these extremal approaches and suggest that, as
currently constructed, they have severe limitations due to the
assumptions required in their use. See also [1], [2], and [18].

10

See Sherman [22] for an extended discussion of why these DEA results
shift when indices are used to replace data measured in physical units.

11

Specifically, the value of E for H11 with ε = .001 is calculated as
follows from (8):

$$E = h_0^{*} - \varepsilon \left[ \sum_{r=1}^{3} s_r^{-*} + \sum_{i=1}^{3} s_i^{+*} \right] = 1.0 - .001\,[5.11 + 45.71 + 96.4] \approx .853$$

The slack values, $s_r^{-*}$ and $s_i^{+*}$, for H11 are reported in table 4.

12

See for example Banker, Charnes, Cooper, and Schinnar [3].